To develop a program that, generates a Decision Tree via ID3 Algorithm with the given collection of Data.
A decision tree is a rooted, directed tree akin to a flowchart. Each internal node corresponds to a partitioning decision, and each leaf node is mapped to a class label prediction. To classify a data item, we imagine the data item to be traversing the tree, beginning at the root. Each internal node is programmed with a splitting rule, which partitions the domain of one (or more) of the data’s attributes. Based on the splitting rule, the data item is sent forward to one of the node’s children.This testing and forwarding is repeated until the data item reaches a leaf node.
ID3 is a nonincremental algorithm, meaning it derives its classes from a fixed set of training instances. An incremental algorithm revises the current concept definition, if necessary, with a new sample. The classes created by ID3 are inductive, that is, given a small set of training instances, the specific classes created by ID3 are expected to work for all future instances. The distribution of the unknowns must be the same as the test cases. Induction classes cannot be proven to work in every case since they may classify an infinite number of instances
R-programming has been specifically used for generating the Decision trees using ID3 Algorithm.The training sets are read as dataframe into the global environment.
dataTraining <- read.csv("C:/Krishna/ML-622/Project3/contact-lenses.data", sep="")
#dataTraining <- read.csv("C:/Krishna/ML-622/Project3/fishing.data", sep="")
dataTraining
## A. Age Prescription Astigmatism TearRate Oracle
## 1 D: young myope no reduced none
## 2 D: young myope no normal soft
## 3 D: young myope yes reduced none
## 4 D: young myope yes normal hard
## 5 D: young hypermetrope no reduced none
## 6 D: young hypermetrope no normal soft
## 7 D: young hypermetrope yes reduced none
## 8 D: young hypermetrope yes normal hard
## 9 D: pre-presbyopic myope no reduced none
## 10 D: pre-presbyopic myope no normal soft
## 11 D: pre-presbyopic myope yes reduced none
## 12 D: pre-presbyopic myope yes normal hard
## 13 D: pre-presbyopic hypermetrope no reduced none
## 14 D: pre-presbyopic hypermetrope no normal soft
## 15 D: pre-presbyopic hypermetrope yes reduced none
## 16 D: pre-presbyopic hypermetrope yes normal none
## 17 D: presbyopic myope no reduced none
## 18 D: presbyopic myope no normal none
## 19 D: presbyopic myope yes reduced none
## 20 D: presbyopic myope yes normal hard
## 21 D: presbyopic hypermetrope no reduced none
## 22 D: presbyopic hypermetrope no normal soft
## 23 D: presbyopic hypermetrope yes reduced none
## 24 D: presbyopic hypermetrope yes normal none
Taining data been read into DataFrame
uniqueOracle<- unique(dataTraining$Oracle)
Unique Oracle data has been factored
Entropy<-function()
{
distCountVec<-vector(mode = "numeric")
probdist<-vector(mode = "numeric")
for(j in 1:length(uniqueOracle))
{
sumDistinct=0
for(i in 1:nrow(dataTraining))
{
if(dataTraining$Oracle[i]==uniqueOracle[j])
sumDistinct= sumDistinct+1
}
distCount=sumDistinct
distCountVec<-append(distCountVec,distCount)
probdist<- append(probdist,(distCount/nrow(dataTraining)))
}
probVec<- sapply(probdist, function(x) (x)*log2(x))
sum <- sum(probVec)
return(-sum)
}
Global Entropy has been Calculated.
getDataSet<- function(i,j,dataTraining){
dataTraining[dataTraining[j] == toString(unique(dataTraining[[j]])[i]) ,]
}
Method to get the dataset on every factorization.
atomicEntropy<- function(atomicData,entropySum){
dataSetProb<-apply(atomicData,2,function(x) x/sum(x))
dataSetlog<-apply(dataSetProb,2,function(x) (x)*log2(x))
dataSetlog<- replace(dataSetlog,is.nan(dataSetlog),0)
dataSetEntropy<- apply(dataSetlog,2,function(x) -sum(x))
return(dataSetEntropy)
}
Method to get the atomic entropy at ech level to get the Information Gain at each level to find the node.
EntropyS<- Entropy()
infoGainVector<- vector(mode = "numeric")
for(i in 2:(ncol(dataTraining)-1)){
nodedataSetTemp<-table(dataTraining$Oracle,dataTraining[[i]])
nodemargintemp<- margin.table(nodedataSetTemp, 2)
entropySum<- vector(mode = "numeric")
nodedataSetEntropy<-atomicEntropy(nodedataSetTemp)
nodedataSetEntropy<-replace(nodedataSetEntropy,is.nan(nodedataSetEntropy),0)
sum=0
for(i in 1:length(nodedataSetEntropy)){
sum= sum+ nodedataSetEntropy[i]*(nodemargintemp[i]/sum(nodemargintemp))
}
nodeGainInfo<- EntropyS-sum
#print(GainInfo)
infoGainVector<- append(infoGainVector,nodeGainInfo)
}
NodeColumn<- which.max(infoGainVector)+1
nodeTabletemp<- table(dataTraining$Oracle,dataTraining[[NodeColumn]])
nodeSetEntropy<-atomicEntropy(nodeTabletemp)
nodeSetEntropy<-replace(nodeSetEntropy,is.nan(nodeSetEntropy),0)
dataTraining[[1]]<-NULL # First Column has been deleted
NodeColumn<- NodeColumn-1
Root Node has been calculated
AlgorithmDecision<- function(node,newdataTraining,nodeSetEntropy){
##restricts the data when no attributes are left
if(ncol(newdataTraining)<=1){
print(unique(newdataTraining$Oracle))
}
else
{
##Looping to all the attributes of the rootNode
for(i in 1:length(unique(newdataTraining[[node]]))){
nodeEntropy<-EntropyS
truncData<-getDataSet(i,node,newdataTraining)
# Factored Data has been generated
print(names(truncData)[node])
print(unique(truncData[[node]]))
truncData[[node]]<-NULL
newDatatrain<-truncData
if(length(unique(newDatatrain$Oracle))<=1){
print(unique(newDatatrain$Oracle))
return
}
else{
print("Next Node........")
GainVector<- vector(mode = "numeric")
#Looping to all the columns of the factored dataset
for(i in 1:(ncol(newDatatrain)-1)){
dataSetTemp<-table(newDatatrain$Oracle,newDatatrain[[i]])
margintemp<- margin.table(dataSetTemp, 2)
entropySum<- vector(mode = "numeric")
dataSetEntropy<-atomicEntropy(dataSetTemp)
dataSetEntropy<-replace(dataSetEntropy,is.nan(dataSetEntropy),0)
sum=0
for(i in 1:length(dataSetEntropy)){
sum= sum+ dataSetEntropy[i]*(margintemp[i]/sum(margintemp))
}
GainInfo<- nodeEntropy-sum
GainVector<- append(GainVector,GainInfo)
}
if((length(GainVector)==0)||(length(GainVector)==1)){
print(unique(newDatatrain[[1]]))
print(unique(newDatatrain$Oracle))
return
}
else{
nodeElement<- which.max(GainVector)
nodeTabletemp<- table(newDatatrain$Oracle,newDatatrain[[i]])
nodeSetEntropy<-atomicEntropy(nodeTabletemp)
nodeSetEntropy<-replace(nodeSetEntropy,is.nan(nodeSetEntropy),0)
AlgorithmDecision(nodeElement,newDatatrain,nodeSetEntropy)
}
}
}
}
}
The Method AlgorithmDecision has slightly modified version of ID3 Algorithm to get the decision trees. The Information gain has been slightly modified, where Entropy(S) has been the global entropy. The global entropy has been chosen to make computation easier, because teh information gain is calculated to find the maximum amon the calculated nodes to find the next node.
AlgorithmDecision(NodeColumn,dataTraining,nodeSetEntropy)
## [1] "TearRate"
## [1] reduced
## Levels: normal reduced
## [1] none
## Levels: hard none soft
## [1] "TearRate"
## [1] normal
## Levels: normal reduced
## [1] "Next Node........"
## [1] "Astigmatism"
## [1] no
## Levels: no yes
## [1] "Next Node........"
## [1] "Age"
## [1] young
## Levels: pre-presbyopic presbyopic young
## [1] soft
## Levels: hard none soft
## [1] "Age"
## [1] pre-presbyopic
## Levels: pre-presbyopic presbyopic young
## [1] soft
## Levels: hard none soft
## [1] "Age"
## [1] presbyopic
## Levels: pre-presbyopic presbyopic young
## [1] "Next Node........"
## [1] myope hypermetrope
## Levels: hypermetrope myope
## [1] none soft
## Levels: hard none soft
## [1] "Astigmatism"
## [1] yes
## Levels: no yes
## [1] "Next Node........"
## [1] "Prescription"
## [1] myope
## Levels: hypermetrope myope
## [1] hard
## Levels: hard none soft
## [1] "Prescription"
## [1] hypermetrope
## Levels: hypermetrope myope
## [1] "Next Node........"
## [1] young pre-presbyopic presbyopic
## Levels: pre-presbyopic presbyopic young
## [1] hard none
## Levels: hard none soft
Output for “Fishing Data”
[1] “Forecast” [1] Sunny Levels: Cloudy Rainy Sunny [1] “Next Node is……..” [1] “Wind” [1] Strong Levels: Strong Weak [1] Yes Levels: No Yes [1] “Wind” [1] Weak Levels: Strong Weak [1] “Next Node is……..” [1] “Water” [1] Warm Levels: Cold Moderate Warm [1] No Levels: No Yes [1] “Water” [1] Cold Levels: Cold Moderate Warm [1] No Levels: No Yes [1] “Water” [1] Moderate Levels: Cold Moderate Warm [1] Yes Levels: No Yes [1] “Forecast” [1] Cloudy Levels: Cloudy Rainy Sunny [1] Yes Levels: No Yes [1] “Forecast” [1] Rainy Levels: Cloudy Rainy Sunny [1] “Next Node is……..” [1] “Air” [1] Warm Levels: Cool Warm [1] “Next Node is……..” [1] “Wind” [1] Strong Levels: Strong Weak [1] Yes Levels: No Yes [1] “Wind” [1] Weak Levels: Strong Weak [1] No Levels: No Yes [1] “Air” [1] Cool Levels: Cool Warm [1] No Levels: No Yes
The program would have been much effective if the display content would have been produced instead of displaying nodes. However, the analysis of the nodes have been made and produced as decision tree down
library(data.tree)
## Warning: package 'data.tree' was built under R version 3.2.3
ForeCast<- Node$new("Forecast")
Sunny<-ForeCast$AddChild("Sunny")
Cloudy<- ForeCast$AddChild("Cloudy")
Rainy<-ForeCast$AddChild("Rainy")
SunnyWind<-Sunny$AddChild("Wind")
SunnyWeakWind<- SunnyWind$AddChild("Weak")
SunnystrongWind<- SunnyWind$AddChild("Strong")
SunnyWeakWater<- SunnyWeakWind$AddChild("Water")
SunnyWeakWater$AddChild("Warm")$AddChild("No")
SunnyWeakWater$AddChild("Cold")$AddChild("No")
SunnyWeakWater$AddChild("Moderate")$AddChild("Yes")
Cloudy$AddChild("Yes")
RainyAir<- Rainy$AddChild("Air")
RainyAirCool<- RainyAir$AddChild("Cool")
RainyAirCool$AddChild("No")
RainyAirWarm<- RainyAir$AddChild("Warm")
RainyAirWind<- RainyAirWarm$AddChild("Wind")
RainyAirWind$AddChild("Strong")$AddChild("Yes")
RainyAirWind$AddChild("Weak")$AddChild("No")
print(ForeCast)
## levelName
## 1 Forecast
## 2 ¦--Sunny
## 3 ¦ °--Wind
## 4 ¦ ¦--Weak
## 5 ¦ ¦ °--Water
## 6 ¦ ¦ ¦--Warm
## 7 ¦ ¦ ¦ °--No
## 8 ¦ ¦ ¦--Cold
## 9 ¦ ¦ ¦ °--No
## 10 ¦ ¦ °--Moderate
## 11 ¦ ¦ °--Yes
## 12 ¦ °--Strong
## 13 ¦--Cloudy
## 14 ¦ °--Yes
## 15 °--Rainy
## 16 °--Air
## 17 ¦--Cool
## 18 ¦ °--No
## 19 °--Warm
## 20 °--Wind
## 21 ¦--Strong
## 22 ¦ °--Yes
## 23 °--Weak
## 24 °--No
plot(ForeCast)
TearRate<- Node$new("TearRate")
Reduced<- TearRate$AddChild("Reduced")
Normal<- TearRate$AddChild("Normal")
Reduced$AddChild("None")
NormAst<-Normal$AddChild("Astigmatism")
NormAstno<- NormAst$AddChild("No")
NormAstyes<- NormAst$AddChild("Yes")
NormAstnoAge<- NormAstno$AddChild("Age")
noAgeYng<- NormAstnoAge$AddChild("Young")$AddChild("Soft")
noAgepre<- NormAstnoAge$AddChild("Pre-Presbyopic")$AddChild("Soft")
noAgepres<- NormAstnoAge$AddChild("Presbyopic")
noAgepresSight<- noAgepres$AddChild("Prescription")
noAgepresSight$AddChild("Myope")$AddChild("None")
noAgepresSight$AddChild("HyperMetrope")$AddChild("Soft")
YesPres<- NormAstyes$AddChild("Prescription")
YesPresMyope<- YesPres$AddChild("Myope")$AddChild("Hard")
YespresHyper<- YesPres$AddChild("HyperMyope")
YesAge<- YespresHyper$AddChild("Age")
YesAge$AddChild("Young")$AddChild("Hard")
YesAge$AddChild("Pre-Presbyopic")$AddChild("None")
YesAge$AddChild("Presbyopic")$AddChild("None")
plot(TearRate)
The correctness fishing data has been verified as a known example. However, Unknown example has been Contact-lenses data, Tear Rate being the root node has been making some sense, as the contact lenses initially required for such conditions.The down to leaf decision tree looks pretty much interesting and should make sense in predicting the data.